
    Vision based estimation, localization, and mapping for autonomous vehicles

    In this dissertation, we focus on developing simultaneous localization and mapping (SLAM) algorithms with a robot-centric estimation framework primarily using monocular vision sensors. A primary contribution of this work is to use a robot-centric mapping framework concurrently with a world-centric localization method. We exploit the differential equation of motion of the normalized pixel coordinates of each point feature in the robot body frame. Another contribution of our work is to exploit a multiple-view geometry formulation with initial and current view projection of point features. We extract the features from objects surrounding the river and their reflections. The correspondences of the features are used along with the attitude and altitude information of the robot. We demonstrate that the observability of the estimation system is improved by applying our robot-centric mapping framework and multiple-view measurements. Using the robot-centric mapping framework and multiple-view measurements including reflection of features, we present a vision based localization and mapping algorithm that we developed for an unmanned aerial vehicle (UAV) flying in a riverine environment. Our algorithm estimates the 3D positions of point features along a river and the pose of the UAV. Our UAV is equipped with a lightweight monocular camera, an inertial measurement unit (IMU), a magnetometer, an altimeter, and an onboard computer. To our knowledge, we report the first result that exploits the reflections of features in a riverine environment for localization and mapping. We also present an omnidirectional vision based localization and mapping system for a lawn mowing robot. Our algorithm can detect whether the robotic mower is contained in a permitted area. Our robotic mower is modified with an omnidirectional camera, an IMU, a magnetometer, and a vehicle speed sensor. Here, we also exploit the robot-centric mapping framework. 
The estimator in our system generates a 3D point based map with landmarks. Concurrently, the estimator defines a boundary of the mowing area by using the estimated trajectory of the mower. The estimated boundary and the landmark map are provided for the estimation of the mowing location and for the containment detection. First, we derive a nonlinear observer with contraction analysis and pseudo-measurements of the depth of each landmark to prevent the map estimator from diverging. Of particular interest for this work is ensuring that the estimator for localization and mapping will not fail due to the nonlinearity of the system model. For batch estimation, we design a hybrid extended Kalman smoother for our localization and robot-centric mapping model. Finally, we present a single camera based SLAM algorithm using a convex optimization based nonlinear estimator. We validate the effectiveness of our algorithms through numerical simulations and outdoor experiments.

    A New Spectral Method in Time Series Analysis

    In this dissertation, we propose a new spectral method that can be used to overcome two issues in time series analysis. The first issue is the small sample problem. The periodogram is widely used to analyze second order stationary time series, since the expectation of the periodogram is approximately equal to the underlying spectral density of the time series. However, it is well known that the periodogram suffers from a finite sample bias. We show that the bias arises because of the finite boundary of observation in the discrete Fourier transform (DFT), which is used in the construction of the periodogram. Moreover, we show that by using the best linear predictors of the time series outside the observed domain, we can obtain the “complete periodogram”, which is an unbiased estimator of the spectral density. We propose a method for estimating the best linear predictors and prove, both theoretically and empirically, that the resulting estimated complete periodogram has a smaller bias than the regular periodogram. The estimated complete periodogram can be used to estimate parameters that are expressed as a weighted sum of the spectral density. The second issue is the discrepancy between time and frequency domain methods in parameter estimation. In time series analysis, there is a clear distinction between the two domain methods. We draw connections between the two domain methods by deriving an exact and interpretable bound between the Gaussian and Whittle likelihoods of a second order stationary time series. The derivation is based on obtaining the transformation that is biorthogonal to the DFT of the time series. Such a transformation yields a new decomposition for the inverse of a Toeplitz matrix and enables the representation of the Gaussian likelihood within the frequency domain. Based on this result, we obtain an approximation for the difference between the Gaussian and Whittle likelihoods and define two new frequency domain quasi-likelihood criteria. 
We show that these new criteria are computationally fast and yield a better approximation of the spectral divergence criterion, as compared to both the Gaussian and Whittle likelihoods.
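
    As an illustration of the finite sample bias mentioned in the abstract above, the following is a minimal sketch of the regular periodogram only; it does not implement the dissertation's complete periodogram or its predictors. The AR(1) test model, the function names (`periodogram`, `ar1_spectral_density`, `simulate_ar1`, `mean_periodogram`), and the convention f(w) = sum_r c(r) exp(-i r w) are assumptions chosen for the example, not details taken from the source.

```python
import numpy as np

def periodogram(x):
    """Regular periodogram I_n(w_k) = |sum_t x_t exp(-i t w_k)|^2 / n
    at the Fourier frequencies w_k = 2*pi*k/n."""
    n = len(x)
    dft = np.fft.fft(x)
    freqs = 2 * np.pi * np.arange(n) / n
    return freqs, np.abs(dft) ** 2 / n

def ar1_spectral_density(freqs, phi, sigma2=1.0):
    """Spectral density of the (assumed) AR(1) model x_t = phi*x_{t-1} + e_t,
    using the convention f(w) = sum_r c(r) exp(-i r w)."""
    return sigma2 / (1 - 2 * phi * np.cos(freqs) + phi ** 2)

def simulate_ar1(n, phi, rng, burn_in=200):
    """Simulate n observations of an AR(1) process after a burn-in period."""
    e = rng.standard_normal(n + burn_in)
    x = np.zeros(n + burn_in)
    for t in range(1, n + burn_in):
        x[t] = phi * x[t - 1] + e[t]
    return x[burn_in:]

def mean_periodogram(n, phi, reps, seed=0):
    """Average the periodogram over `reps` simulated series to approximate
    its expectation, which can then be compared with the true density."""
    rng = np.random.default_rng(seed)
    total = np.zeros(n)
    for _ in range(reps):
        freqs, pgram = periodogram(simulate_ar1(n, phi, rng))
        total += pgram
    return freqs, total / reps
```

    Calling, for example, `mean_periodogram(20, 0.9, 2000)` and comparing the result with `ar1_spectral_density(freqs, 0.9)` gives a rough empirical view of how far the expected periodogram sits from the spectral density when the sample is short; repeating the comparison with a larger `n` should show the gap shrinking.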

    Unified Contrastive Fusion Transformer for Multimodal Human Action Recognition

    Various types of sensors have been considered to develop human action recognition (HAR) models. Robust HAR performance can be achieved by fusing multimodal data acquired by different sensors. In this paper, we introduce a new multimodal fusion architecture, referred to as Unified Contrastive Fusion Transformer (UCFFormer), designed to integrate data with diverse distributions to enhance HAR performance. Based on the embedding features extracted from each modality, UCFFormer employs the Unified Transformer to capture the inter-dependency among embeddings in both time and modality domains. We present the Factorized Time-Modality Attention to perform self-attention efficiently for the Unified Transformer. UCFFormer also incorporates contrastive learning to reduce the discrepancy in feature distributions across various modalities, thus generating semantically aligned features for information fusion. Performance evaluation conducted on two popular datasets, UTD-MHAD and NTU RGB+D, demonstrates that UCFFormer achieves state-of-the-art performance, outperforming competing methods by considerable margins.

    Diffusion Video Autoencoders: Toward Temporally Consistent Face Video Editing via Disentangled Video Encoding

    Inspired by the impressive performance of recent face image editing methods, several studies have proposed extending these methods to the face video editing task. One of the main challenges here is temporal consistency among edited frames, which is still unresolved. To this end, we propose a novel face video editing framework based on diffusion autoencoders that, for the first time among face video editing models, successfully extracts the decomposed features of identity and motion from a given video. This modeling allows us to edit the video by simply manipulating the temporally invariant feature in the desired direction, which keeps the edited frames consistent. Another unique strength of our model is that, since it is based on diffusion models, it can satisfy both reconstruction and edit capabilities at the same time, and is robust to corner cases in wild face videos (e.g. occluded faces), unlike the existing GAN-based methods. Comment: CVPR 2023. Our project page: https://diff-video-ae.github.i

    Pixel data real time processing as a next step for HL-LHC upgrades and beyond

    The experiments at the LHC are implementing novel and challenging detector upgrades for the High Luminosity LHC, including the tracking systems. This paper reports on performance studies, illustrated by an electron trigger, using a simplified pixel tracker. To achieve a real-time trigger (e.g. processing HL-LHC collision events at 40 MHz), simple algorithms are developed for reconstructing pixel-based tracks and track isolation, utilizing look-up tables based on pixel detector information. Significant gains in electron trigger performance are seen when pixel detector information is included. In particular, a rate reduction of up to a factor of 20 is obtained with a signal selection efficiency of more than 95% over the whole $\eta$ coverage of this detector. Furthermore, it reconstructs p-p collision points in the beam axis (z) direction, with a high precision of 20 $\mu$m resolution in the very central region ($|\eta| < 0.8$), and up to 380 $\mu$m in the forward region ($2.7 < |\eta| < 3.0$). This study as well as the results can easily be adapted to the muon case and to the different tracking systems at the LHC and other machines beyond the HL-LHC. The feasibility of such real-time processing of the pixel information is mainly constrained by the Level-1 trigger latency of the experiment. How this might be overcome by the Front-End ASIC design, new processors and embedded Artificial Intelligence algorithms is briefly tackled as well. Comment: To be submitted to JHE

    Corporate Strategies in the Smartphone Era : The Case of Garmin Ltd.


    Survey of Public Attitudes toward the Secondary Use of Public Healthcare Data in Korea

    Objectives: Public healthcare data have become crucial to the advancement of medicine, and recent changes in legal structure on privacy protection have expanded access to these data with pseudonymization. Recent debates on public healthcare data use by private insurance companies have shown large discrepancies in perceptions among the general public, healthcare professionals, private companies, and lawmakers. This study examined public attitudes toward the secondary use of public data, focusing on differences between public and private entities. Methods: An online survey was conducted from January 11 to 24, 2022, involving a random sample of adults between 19 and 65 years of age in 17 provinces, guided by the August 2021 census. Results: The final survey analysis included 1,370 participants. Most participants were aware of health data collection (72.5%) and recent changes in legal structures (61.4%) but were reluctant to share their pseudonymized raw data (51.8%). Overall, they were favorable toward data use by public agencies but disfavored use by private entities, notably marketing and private insurance companies. Concerns were frequently noted regarding commercial use of data and data breaches. Among the respondents, 50.9% were negative about the use of public healthcare data by private insurance companies, 22.9% favored this use, and 1.9% were “very positive.” Conclusions: This survey revealed a low understanding among key stakeholders regarding digital health data use, which is hindering the realization of the full potential of public healthcare data. This survey provides a basis for future policy developments and advocacy for the secondary use of health data.